# High-precision quantization

**Gama 12b GGUF** · mradermacher · Downloads: 185 · Likes: 1
Tags: Large Language Model, Transformers, Multilingual
Gama-12B is a large language model supporting multiple languages, offered in a range of quantized versions to meet different performance and precision requirements.

**Acereason Nemotron 1.1 7B GGUF** · lmstudio-community · License: Other · Downloads: 278 · Likes: 1
Tags: Large Language Model, Multilingual
A high-performance 7B-parameter language model from NVIDIA, focused on mathematical and code reasoning tasks and supporting a 128k context length.

**Delta Vector Austral 24B Winton GGUF** · bartowski · License: Apache-2.0 · Downloads: 421 · Likes: 1
Tags: Large Language Model, English
A quantized version of Delta-Vector's Austral-24B-Winton model, quantized with llama.cpp and suitable for efficient use on a range of hardware configurations.

**Openbuddy OpenBuddy R1 0528 Distill Qwen3 32B Preview0 QAT GGUF** · bartowski · License: Apache-2.0 · Downloads: 720 · Likes: 1
Tags: Large Language Model, Multilingual
A quantized version of OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT; quantization lets the model run more efficiently under different hardware conditions.

**Infly Inf O1 Pi0 GGUF** · bartowski · Downloads: 301 · Likes: 1
Tags: Large Language Model, Multilingual
A quantized version of the infly/inf-o1-pi0 model, supporting multilingual text generation and optimized with llama.cpp's imatrix quantization.
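
Several entries on this page (including the one above) are produced with llama.cpp's imatrix quantization, which uses an importance matrix computed on calibration text to decide where low-bit formats can be applied most aggressively. The sketch below drives that pipeline from Python; it assumes llama.cpp is built and on PATH, that the binary names and flags (`llama-imatrix`, `llama-quantize`, `--imatrix`) match recent llama.cpp builds, and every file name is a placeholder rather than a file from this listing.

```python
# Hedged sketch of the llama.cpp imatrix quantization pipeline.
# Assumes llama.cpp has been built and its binaries are on PATH;
# all file names below are placeholders, not files from this page.
import subprocess

FP16_GGUF = "model-f16.gguf"    # full-precision GGUF export of the source model
CALIB_TXT = "calibration.txt"   # plain-text calibration corpus
IMATRIX   = "imatrix.dat"       # importance matrix produced in step 1
OUT_GGUF  = "model-Q4_K_M.gguf" # quantized output

# 1) Measure per-tensor activation importance on the calibration text.
subprocess.run(
    ["llama-imatrix", "-m", FP16_GGUF, "-f", CALIB_TXT, "-o", IMATRIX],
    check=True,
)

# 2) Quantize, letting the importance matrix guide the low-bit formats.
subprocess.run(
    ["llama-quantize", "--imatrix", IMATRIX, FP16_GGUF, OUT_GGUF, "Q4_K_M"],
    check=True,
)
```

The Q4_K_M file produced in step 2 is one example of the "multiple quantization types" that repositories like these typically publish side by side.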
**Pocketdoc Dans PersonalityEngine V1.3.0 24b GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,543 · Likes: 4
Tags: Large Language Model
A multilingual, multi-purpose large language model covering a range of professional domains and general tasks, suitable for role-playing, story writing, programming, and other scenarios.

**Gryphe Pantheon Proto RP 1.8 30B A3B GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,972 · Likes: 6
Tags: Large Language Model, English
A quantized version of the Gryphe/Pantheon-Proto-RP-1.8-30B-A3B model, quantized with llama.cpp and suited to role-playing and text generation tasks.

**Huihui Ai Qwen3 14B Abliterated GGUF** · bartowski · License: Apache-2.0 · Downloads: 6,097 · Likes: 5
Tags: Large Language Model
Qwen3-14B-abliterated is a quantized version of the Qwen3-14B model, optimized with llama.cpp and offering multiple quantization options to meet different performance requirements.

**Mlabonne Qwen3 14B Abliterated GGUF** · bartowski · Downloads: 18.67k · Likes: 16
Tags: Large Language Model
A quantized version of the Qwen3-14B-abliterated model, produced with llama.cpp's imatrix option and suitable for text generation tasks.

**Qwen Qwen3 32B GGUF** · bartowski · License: Apache-2.0 · Downloads: 49.13k · Likes: 35
Tags: Large Language Model
A quantized version of Qwen/Qwen3-32B, quantized with llama.cpp and offering multiple quantization types for different hardware requirements.
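
Whichever repository is chosen, the usage pattern is the same: download a single .gguf file at the quantization level that fits your hardware and load it with a llama.cpp front end. Below is a minimal sketch using the `huggingface_hub` and `llama-cpp-python` packages; the repo id and file name are illustrative assumptions, so copy the exact names from the model card you actually use.

```python
# Hedged sketch: fetch one quantization of a GGUF repo and chat with it locally.
# Requires the optional packages huggingface_hub and llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Illustrative names only; substitute the real repo id and .gguf file name
# shown on the model card you want.
model_path = hf_hub_download(
    repo_id="bartowski/Qwen_Qwen3-32B-GGUF",
    filename="Qwen_Qwen3-32B-Q4_K_M.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "In one sentence, what is GGUF?"}],
    max_tokens=64,
)
print(reply["choices"][0]["message"]["content"])
```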
**Nvidia OpenMath Nemotron 14B Kaggle GGUF** · bartowski · Downloads: 432 · Likes: 1
Tags: Large Language Model, English
A 14B-parameter mathematical large language model open-sourced by NVIDIA, quantized with llama.cpp so it can run efficiently under different hardware conditions.

**Mistral Small 24B Instruct 2501 GGUF** · bartowski · License: Apache-2.0 · Downloads: 48.61k · Likes: 111
Tags: Large Language Model, Multilingual
Mistral-Small-24B-Instruct-2501 is a 24B-parameter instruction-tuned large language model supporting multilingual text generation tasks.

**Pocketdoc Dans SakuraKaze V1.0.0 12b GGUF** · bartowski · License: Apache-2.0 · Downloads: 788 · Likes: 3
Tags: Large Language Model, English
A llama.cpp imatrix-quantized version of PocketDoc/Dans-SakuraKaze-V1.0.0-12b, supporting multiple quantization types and suitable for text generation tasks.

**Llama 3.3 70B Instruct Abliterated GGUF** · bartowski · Downloads: 7,786 · Likes: 8
Tags: Large Language Model, Multilingual
A 70B-parameter large language model based on the Llama 3.3 architecture, supporting multilingual text generation and quantized for use in a variety of hardware environments.

**Zero Mistral 24B GGUF** · ZeroAgency · License: MIT · Downloads: 613 · Likes: 3
Tags: Large Language Model, Multilingual
Zero-Mistral-24B is a large language model based on the Mistral architecture, supporting Russian and English and suited to dialogue and text generation tasks.

**Google Gemma 3 27b It Qat GGUF** · bartowski · Downloads: 14.97k · Likes: 31
Tags: Large Language Model
A quantized version of Google's 27B-parameter instruction-tuned Gemma 3 model, built from quantization-aware training (QAT) weights and offering multiple quantization levels to meet different hardware requirements.
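
The recurring claim that these repositories offer "multiple quantization levels to meet different hardware requirements" comes down to simple arithmetic: weight memory is roughly parameter count times bits per weight. The estimator below uses commonly quoted, approximate bits-per-weight figures for llama.cpp quant types; real files run a little larger because of embeddings, metadata, and the runtime KV cache.

```python
# Rough memory estimate for a quantized GGUF: params (billions) x bits/weight / 8.
# The bits-per-weight values are approximations, not exact llama.cpp figures.
APPROX_BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7,
              "Q4_K_M": 4.8, "Q3_K_M": 3.9, "Q2_K": 2.6}

def approx_size_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the weights alone, in GB."""
    return params_billion * APPROX_BPW[quant] / 8

for quant in APPROX_BPW:
    # Example: a 27B model such as the Gemma 3 QAT build listed above.
    print(f"27B @ {quant:>6}: ~{approx_size_gb(27, quant):.1f} GB")
```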
**Nvidia Llama 3 1 Nemotron Ultra 253B V1 GGUF** · bartowski · License: Other · Downloads: 1,607 · Likes: 3
Tags: Large Language Model, English
A quantized version of NVIDIA's Llama-3_1-Nemotron-Ultra-253B-v1 model, quantized with llama.cpp, supporting multiple quantization types and suitable for a range of hardware environments.

**Llama 4 Scout 17B 16E Instruct GGUF** · second-state · License: Other · Downloads: 2,959 · Likes: 0
Tags: Large Language Model, Transformers, Multilingual
Llama-4-Scout-17B-16E-Instruct is a multilingual instruction-tuned model that can be run through LlamaEdge.

**Qwen Qwen2.5 VL 32B Instruct GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,782 · Likes: 1
Tags: Image-Text-to-Text, English
Qwen2.5-VL-32B-Instruct is a 32B-parameter multimodal vision-language model supporting image understanding and text generation tasks.

**Gemma 3 R1984 27B Q6 K GGUF** · GrimsenClory · Downloads: 28 · Likes: 1
Tags: Large Language Model, Multilingual
A GGUF-format model converted from VIDraft/Gemma-3-R1984-27B, supporting multilingual text generation.

**Mlabonne Gemma 3 12b It Abliterated GGUF** · bartowski · Downloads: 7,951 · Likes: 6
Tags: Large Language Model
A quantized version of the mlabonne/gemma-3-12b-it-abliterated model, imatrix-quantized with llama.cpp and suitable for text generation tasks.

**Gemma 3 12b It Q8 0 GGUF** · NikolayKozloff · Downloads: 89 · Likes: 1
Tags: Large Language Model
A model converted from google/gemma-3-12b-it to GGUF format, for use with the llama.cpp framework.

**Gemma 3 27b It GGUF** · second-state · Downloads: 2,024 · Likes: 0
Tags: Image-Text-to-Text, Transformers
Gemma-3-27b-it-GGUF is a quantized version of Google's Gemma-3-27b-it model, suitable for image-text-to-text tasks.

**Gemma 3 4b It GGUF** · second-state · Downloads: 2,120 · Likes: 0
Tags: Transformers
Gemma-3-4b-it-GGUF is a quantized version of Google's Gemma-3-4b-it model that runs on LlamaEdge and is suitable for image-text-to-text tasks.

**Rombo Org Rombo LLM V3.1 QWQ 32b GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,132 · Likes: 5
Tags: Large Language Model
Rombo-LLM-V3.1-QWQ-32b is a 32B-parameter large language model, imatrix-quantized with llama.cpp and offered in multiple quantized versions to suit different hardware requirements.

**Thedrummer Skyfall 36B V2 GGUF** · bartowski · License: Other · Downloads: 40.42k · Likes: 11
Tags: Large Language Model
Skyfall-36B-v2 is a 36B-parameter large language model, imatrix-quantized with llama.cpp and offered in multiple quantized versions to suit different hardware requirements.

**L3.3 MS Nevoria 70b GGUF** · bartowski · Downloads: 5,252 · Likes: 12
Tags: Large Language Model
A quantized version of the Steelskull/L3.3-MS-Nevoria-70b model, imatrix-quantized with llama.cpp and supporting multiple quantization levels for different hardware environments.

**Sky T1 32B Preview GGUF** · bartowski · Downloads: 1,069 · Likes: 81
Tags: Large Language Model, English
Sky-T1-32B-Preview is a 32B-parameter large language model, quantized with llama.cpp's imatrix option and suitable for text generation tasks.

**EXAONE 3.5 32B Instruct GGUF** · bartowski · License: Other · Downloads: 616 · Likes: 9
Tags: Large Language Model, Multilingual
EXAONE-3.5-32B-Instruct is a 32B-parameter large language model supporting instruction following and dialogue tasks.

**Impish Mind 8B GGUF** · bartowski · License: Apache-2.0 · Downloads: 532 · Likes: 9
Tags: Large Language Model, English
A quantized version of the SicariusSicariiStuff/Impish_Mind_8B model, processed with llama.cpp into a range of quantization formats and suitable for text generation tasks.

**Mini Magnum 12b V1.1 GGUF** · Reiterate3680 · License: Other · Downloads: 252 · Likes: 2
Tags: Large Language Model, English
Mini-Magnum-12b-v1.1 is a text generation model built on the intervitens/mini-magnum-12b-v1.1 base model, supporting English and distributed in quantized form.

**Gemma 2 27b It Q8 0 GGUF** · KimChen · Downloads: 471 · Likes: 2
Tags: Large Language Model
A GGUF-format model converted from Google's Gemma-2-27b-it model, suitable for text generation tasks.

**Darksapling V2 Ultra Quality 7B GGUF** · DavidAU · License: Apache-2.0 · Downloads: 385 · Likes: 3
Tags: Large Language Model, English
A complete remerge and remaster of the Dark Sapling V2 7B model with a 32k context length, featuring ultra-high quality and 32-bit precision improvements.

**Llama 3 Cat 8b Instruct V1 GGUF** · bartowski · Downloads: 909 · Likes: 12
Tags: Large Language Model
An 8B-parameter instruction-tuned model based on Meta's Llama 3 architecture, quantized to GGUF and suitable for resource-constrained environments.

**Mixtral 8x22B V0.1 GGUF** · bartowski · License: Apache-2.0 · Downloads: 597 · Likes: 12
Tags: Large Language Model, Multilingual
A quantized version of Mixtral-8x22B-v0.1, quantized with llama.cpp, supporting multiple languages and quantization types.